Before analysis, users should consider finer-scale filtering to clean the NestWatch dataset after running nw.cleandata. This may include selecting certain species, constraining nest phenology dates (e.g., incubation should not last longer than X days for species Y), or limiting nest attempts to a certain geographic area.

Filter Species

Limiting the dataset to just a few species can easily be done using the pipe (%>%). If you are unfamiliar with “piping”, see the magrittr package. Below we will subset the merged.data dataframe produced in the Intro vignette to include only attempts for Carolina Wren (“carwre”) and Bewick’s Wren (“bewwre”, n = 15,339 attempts) and will further subset those nests to improve map rendering speed here.

library(dplyr)

# Filter data to include only carwre and bewwre
wrens <- merged.data %>% filter(Species.Code %in% c("carwre", "bewwre"))

# View what species are in the new dataset
unique(wrens$Species.Name)
#> [1] "Carolina Wren" "Bewick's Wren"

# Subset dataframe to get just a few columns of interest
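A column subset of this kind might look like the following minimal sketch. It runs on a small made-up data frame rather than the real wrens object, and the column names other than Species.Code and Species.Name (Attempt.ID, Latitude, Longitude, Notes) are assumptions about the dataset, not a statement of which columns the original workflow keeps.

```r
library(dplyr)

# Toy stand-in for the filtered `wrens` data frame (all values are made up)
wrens_demo <- data.frame(
  Attempt.ID   = c("A1", "A2", "A3"),
  Species.Code = c("carwre", "bewwre", "bewwre"),
  Species.Name = c("Carolina Wren", "Bewick's Wren", "Bewick's Wren"),
  Latitude     = c(35.1, 37.8, 45.5),
  Longitude    = c(-80.9, -122.3, -122.7),
  Notes        = c("", "", "")   # hypothetical column not needed for mapping
)

# Keep only the columns needed for mapping and spatial filtering
wrens_small <- wrens_demo %>%
  select(Attempt.ID, Species.Code, Species.Name, Latitude, Longitude)

# Count attempts per species as a quick sanity check on the filter
attempt_counts <- wrens_small %>% count(Species.Code)
```

Dropping unused columns early keeps later spatial objects smaller, which can help interactive maps render faster.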

Filter Spatially

Spatial filters are a flexible way to limit data to a predefined geographic area. A user may choose to limit an analysis to nesting attempts within a single Bird Conservation Region or a select number of states, or to remove likely misidentified species by using a range map filter. If the identifying criteria can be subset directly from the dataset, like states and countries (via Subnational.Code), a user may use subsetting rules to filter their data for analysis. If the criteria are not already subsettable, a spatial filter can be used.
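As a rough illustration of the attribute-based case, the sketch below filters a made-up data frame by Subnational.Code. It assumes the codes follow a "COUNTRY-REGION" pattern such as "US-CA"; check the values in your own data before relying on that format. The object names (nests_demo, west_coast) are hypothetical.

```r
library(dplyr)

# Toy data; the code format "US-CA" is an assumption about Subnational.Code
nests_demo <- data.frame(
  Attempt.ID       = c("A1", "A2", "A3", "A4"),
  Subnational.Code = c("US-CA", "US-OR", "US-NY", "CA-BC"),
  stringsAsFactors = FALSE
)

# Keep attempts from a chosen set of states/provinces
west_coast <- c("US-CA", "US-OR", "US-WA", "CA-BC")
nests_west <- nests_demo %>% filter(Subnational.Code %in% west_coast)

# Or keep a whole country by matching the code prefix
nests_us <- nests_demo %>% filter(startsWith(Subnational.Code, "US-"))
```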

As an example, we can first view a plot of where the nests in wrens are located by species. Here we will use mapview to produce an interactive map, but other packages including tmap also provide interactive mapping interfaces. Because these data are large, we will first select a random subset of nests so interactive plotting will render effectively.

We will use the sf package to create and transform our spatial data. Here we will project the wrens data into the Lambert Conformal Conic Projection, which is well suited for mapping areas in the United States. Note: When plotting any spatial data, please be careful to maintain the correct CRS (coordinate reference system) by projecting unprojected data.
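The subsetting and projection steps described above can be sketched as follows. This is an illustration on made-up coordinates, not the original code: the Latitude and Longitude column names, the sample size, and the particular Lambert Conformal Conic parameters are all assumptions. Applying the same steps to wrens would produce a point layer like the nest_points object used for mapping.

```r
library(dplyr)
library(sf)

set.seed(42)  # make the random subset reproducible

# Toy stand-in for `wrens`; coordinates are made up
wrens_demo <- data.frame(
  Attempt.ID   = paste0("A", 1:6),
  Species.Name = rep(c("Carolina Wren", "Bewick's Wren"), each = 3),
  Latitude     = c(35.1, 33.7, 38.9, 37.8, 45.5, 32.2),
  Longitude    = c(-80.9, -84.4, -77.0, -122.3, -122.7, -110.9)
)

# One possible Lambert Conformal Conic definition (parameters are assumed)
lcc <- "+proj=lcc +lat_1=33 +lat_2=45 +lat_0=40 +lon_0=-97 +datum=WGS84 +units=m"

nest_points_demo <- wrens_demo %>%
  slice_sample(n = 4) %>%                                        # random subset of nests
  st_as_sf(coords = c("Longitude", "Latitude"), crs = 4326) %>%  # declare unprojected WGS84
  st_transform(crs = lcc)                                        # project to LCC
```

Declaring the source CRS with st_as_sf before calling st_transform matters: st_transform can only reproject data whose current CRS is known.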

library(mapview)
mapview(nest_points, 
        zcol = "Species.Name", 
        alpha.regions = 0.4,
        cex = 4,
        legend = TRUE)

By looking at this map, we can see that there are several suspicious nests identified as Bewick’s Wrens in the eastern US as well as one nest in Great Britain! Bewick’s Wrens are not typically recorded east of the Mississippi River, so some of these records could be misidentified. We could decide on a subset of states/provinces to use to filter out-of-range nest attempts, but a better method might be to filter nest locations based on a range map.
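To make the idea of a range-based filter concrete, here is a minimal sketch that keeps only points falling inside a polygon. The rectangle stands in for a real range polygon and the points are made up; st_filter from sf is used here as one reasonable way to do this, not necessarily the approach taken later in this workflow.

```r
library(sf)

# Toy points in WGS84: two inside the polygon, one far outside
pts <- st_as_sf(
  data.frame(id  = c("in_1", "in_2", "out_1"),
             lon = c(-120, -115, -75),
             lat = c(38, 35, 40)),
  coords = c("lon", "lat"), crs = 4326
)

# Hypothetical rectangular "range" polygon standing in for a real range map
range_poly <- st_as_sfc(
  st_bbox(c(xmin = -125, ymin = 30, xmax = -100, ymax = 45), crs = st_crs(4326))
)

# Both layers must share a CRS before they are compared; project both
lcc <- "+proj=lcc +lat_1=33 +lat_2=45 +lat_0=40 +lon_0=-97 +datum=WGS84 +units=m"
pts_proj   <- st_transform(pts, crs = lcc)
range_proj <- st_transform(range_poly, crs = lcc)

# Keep only points that intersect the range polygon
in_range <- st_filter(pts_proj, range_proj)
```

With real data, nests just outside a range boundary may be legitimate records, so a buffer around the polygon (for example with st_buffer) may be worth considering.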

eBird Range Polygons

The eBird Status and Trends Products contain a wealth of information on bird populations. One of the available products is a set of range maps for species for which Status and Trends models have been run. These data are easily accessible in R through the ebirdst package. To access these eBird data, you will need to acquire a free access key. This key will give you access to Status and Trends Data within R. For more information and to acquire an access key, see the documentation here.

We can use our unique access key to download the range maps of Bewick’s Wren and Carolina Wren. Note that you will need the eBird species codes of the species you would like to download, not their alpha codes or common names. By modifying the access key, species, and download location in the code below, you can download the range polygons and load them into your global environment. This code also selects only the breeding range layer if available, and if not available selects the year-round (resident) range layer.

library(ebirdst)
library(sf)
library(dplyr)

# Obtain and set an eBird access key
set_ebirdst_access_key("pasteyourkeyhere")      # you only need to do this once, R will remember it

# Define what species you want to download by their code
spp <- c("bewwre", "carwre")

# Specify where the data will be downloaded to
# Here we will create a folder "spatial" in our working directory:
spatialdata_path <- c("spatial")  

# Download range maps by species
for (i in spp) {
  ebirdst_download_status(species = i, download_abundance = FALSE, 
                          download_ranges = TRUE, pattern = "_smooth_27km_", 
                          path = spatialdata_path)
}

# Read in the range files
for (i in spp) {
  # Generate the path to the .gpkg files
  file_path <- paste0(spatialdata_path, "/2022/", i, "/ranges/", i, "_range_smooth_27km_2022.gpkg")
  # Read in the .gpkg file
  range_data <- st_read(file_path)
  # Generate the name for the object
  object_name <- paste0(i, "_range")
  # Assign the value to the dynamically generated object name
  assign(object_name, range_data)
  rm(range_data)
}

# Select just the breeding layer if available, else the resident layer
object_names <- paste(spp, "range", sep = "_")
for (i in object_names) {
  if (i %in% ls(envir = .GlobalEnv)) {
    data <- get(i, envir = .GlobalEnv)
    # Prefer the breeding season; fall back to the resident (year-round) range
    keep_season <- if ("breeding" %in% data$season) "breeding" else "resident"
    data <- data %>%
      filter(season == keep_season) %>%
      # Match the CRS of nest_points so the two layers can be overlaid
      st_transform(crs = st_crs(nest_points))
    assign(i, data, envir = .GlobalEnv)
    rm(data)
  }
}

# Clean up intermediate objects
rm(file_path, i, object_name, object_names, spatialdata_path)